248 research outputs found

    Comparative genomics suggests limited variability and similar evolutionary patterns between major clades of SARS-CoV-2

    Phylogenomic analysis of SARS-CoV-2 genomes available from public repositories suggests the presence of 3 prevalent groups of viral episomes (super-clades), which are mostly associated with outbreaks in distinct geographic locations (China, USA and Europe). While levels of genomic variability between SARS-CoV-2 isolates are limited, to our knowledge it is not clear whether the observed patterns of variability in viral super-clades reflect ongoing adaptation of SARS-CoV-2, or merely genetic drift and founder effects. Here, we analyze more than 1100 complete, high-quality SARS-CoV-2 genome sequences, and provide evidence for the absence of distinct evolutionary patterns/signatures in the genomes of the currently known major clades of SARS-CoV-2. Our analyses suggest that the presence of distinct viral episomes at different geographic locations is consistent with founder effects, coupled with the rapid spread of this novel virus. We observe that while cross-species adaptation of the virus is associated with hypervariability of specific protein-coding regions (including the RBD of the spike protein), the more variable genomic regions between extant SARS-CoV-2 episomes correspond to the 3′ and 5′ UTRs, suggesting that at present viral protein-coding genes should not be subjected to different adaptive evolutionary pressures in different viral strains. Although this study cannot be conclusive, we believe that the evidence presented here is strongly consistent with the notion that the biased geographic distribution of SARS-CoV-2 isolates should not be associated with adaptive evolution of this novel pathogen.

    Comparative genomics provides an operational classification system and reveals early emergence and biased spatio-temporal distribution of SARS-CoV-2

    Effective systems for the analysis of molecular data are of fundamental importance for real-time monitoring of the spread of infectious diseases and the study of pathogen evolution. While the Nextstrain and GISAID portals offer widely used systems for the classification of SARS-CoV-2 genomes, both present relevant limitations. Here we propose a highly reproducible method for the systematic classification of SARS-CoV-2 viral types. To demonstrate the validity of our approach, we conduct an extensive comparative genomic analysis of more than 20,000 SARS-CoV-2 genomes. Our classification system delineates 12 clusters and 4 super-clusters in SARS-CoV-2, with a highly biased spatio-temporal distribution worldwide, and provides important observations concerning the evolutionary processes associated with the emergence of novel viral types. Based on the estimates of SARS-CoV-2 evolutionary rate and genetic distances of genomes of the early pandemic phase, we infer that SARS-CoV-2 could have been circulating in humans since August-November 2019. The observed pattern of genomic variability is remarkably similar between all clusters and super-clusters, with the UTRs and the s2m element, a highly conserved secondary structure element, being the most variable genomic regions. While several polymorphic sites that are specific to one or more clusters were predicted to be under positive or negative selection, overall our analyses also suggest that the emergence of novel genome types is unlikely to be driven by widespread convergent evolution and independent fixation of advantageous substitutions. While, in the absence of rigorous experimental validation, several questions concerning the evolutionary processes and the phenotypic characteristics (increased/decreased virulence) remain open, we believe that the approach outlined in this study can be of relevance for the tracking and functional characterization of different types of SARS-CoV-2 genomes.

    The Plant NF-Y DNA Matrix In Vitro and In Vivo

    Nuclear Factor Y (NF-Y) is an evolutionarily conserved trimer formed by a Histone-Fold Domain (HFD) heterodimeric module shared by core histones, and the sequence-specific NF-YA subunit. In plants, the genes encoding each of the three subunits have expanded in number, giving rise to hundreds of potential trimers. While in mammals NF-Y binds a well-characterized motif, with a defined matrix centered on the CCAAT box, the specificity of the plant trimers has yet to be determined. Here we report that Arabidopsis thaliana NF-Y trimeric complexes, containing two different NF-YA subunits, bind DNA in vitro with similar affinities. We precisely assayed sequence specificity by saturation mutagenesis, and analyzed genomic DNA sites bound in vivo by selected HFDs. The plant NF-Y CCAAT matrix differs from the mammalian matrix in the nucleotides flanking CCAAT, both in vitro and in vivo. Our data point to flexible DNA-binding rules for plant NF-Ys, which serve to adapt them to a diverse set of genomic motifs.

    VINYL: Variant prIoritizatioN bY survivaL analysis

    Motivation: Clinical applications of genome re-sequencing technologies typically generate large amounts of data that need to be carefully annotated and interpreted to identify genetic variants associated with pathological conditions. In this context, accurate and reproducible methods for the functional annotation and prioritization of genetic variants are of fundamental importance, especially when large volumes of data, like those produced by modern sequencing technologies, are involved. Results: In this paper, we present VINYL, a highly accurate and fully automated system for the functional annotation and prioritization of genetic variants in large-scale clinical studies. Extensive analyses of both real and simulated datasets suggest that VINYL shows higher accuracy and sensitivity when compared to equivalent state-of-the-art methods, allowing the rapid and systematic identification of potentially pathogenic variants in different experimental settings.

    Definition of plant microRNA primary transcripts and their splicing patterns using RNAseq

    Motivation. The prediction of conserved mature microRNAs and their precursor hairpins has been addressed through several computational tools, while the detection of novel and lineage-specific microRNAs is typically approached through deep sequencing of small RNA species. However, a meaningful understanding of both the regulation of miRNA transcription and the potential roles of alternative splicing in post-transcriptional regulation of microRNA biogenesis requires accurate, high-throughput methods to describe primary microRNA transcript structure. Methods. Given that most primary miRNAs in plants are believed to be transcribed by RNA polymerase II, we reasoned that, despite the expected short physiological half-life of such species, ultra-high-throughput sequencing of cDNA should provide evidence of primary miRNA transcripts and splicing of these molecules. We tested this hypothesis using Illumina RNAseq data from the grapevine Vitis vinifera. Reads were mapped to the genome sequence and “islands” of transcription including known miRNA precursors were analysed in detail. All possible canonical splice junctions within such islands were generated computationally and used as targets for mapping of RNAseq reads that did not map to the genome sequence (reads potentially covering splice junctions). Results. We show that for many microRNA precursors, convincing estimates of primary transcript coordinates can be obtained from RNAseq data. Furthermore, estimates of splicing events obtained from our approach can often be validated experimentally. Our data suggest that splicing and alternative splicing of primary miRNAs may be widespread, at least in the grapevine, and that alternative splicing may represent a mechanism of post-transcriptional regulation of miRNA biogenesis.
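
The junction-generation step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "canonical" means GT...AG introns on the forward strand only, and the function name `canonical_junctions`, the flank length, and the intron-length bounds are all hypothetical choices made for illustration; the abstract does not state any of them.

```python
def canonical_junctions(seq, flank=4, min_intron=4, max_intron=1000):
    """Enumerate candidate GT...AG splice junctions in one transcribed island.

    Returns (intron_start, intron_end, junction_sequence) tuples, where the
    junction sequence joins `flank` bases upstream of the donor site with
    `flank` bases downstream of the acceptor site. All parameter values are
    illustrative assumptions, not values taken from the study.
    """
    seq = seq.upper()
    # Positions where a putative intron could start (GT) or end (AG).
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]

    junctions = []
    for d in donors:
        for a in acceptors:
            intron_end = a + 2              # first base after the intron
            intron_len = intron_end - d     # GT and AG are part of the intron
            if not (min_intron <= intron_len <= max_intron):
                continue
            # Require full-length flanks on both sides of the intron.
            if d < flank or intron_end + flank > len(seq):
                continue
            junction = seq[d - flank:d] + seq[intron_end:intron_end + flank]
            junctions.append((d, intron_end, junction))
    return junctions


if __name__ == "__main__":
    # Toy island with a single usable GT...AG intron.
    for start, end, junction in canonical_junctions("CCCCGTAAAAAGTTTT"):
        print(start, end, junction)
```

Reads that fail to map to the genome could then be aligned against the junction sequences; in practice the flanks would be close to the read length, and the intron bounds would be set from known plant intron sizes.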

    CONSTANS imparts DNA sequence specificity to the histone fold NF-YB/NF-YC dimer

    Nuclear Factor Y (NF-Y) is a heterotrimeric transcription factor that binds CCAAT elements. The NF-Y trimer is composed of a Histone Fold Domain (HFD) dimer (NF-YB/NF-YC) and NF-YA, which confers DNA sequence specificity. NF-YA shares a conserved domain with the CONSTANS, CONSTANS-LIKE, TOC1 (CCT) proteins. We show that CONSTANS (CO/B-BOX PROTEIN1 BBX1), a master flowering regulator, forms a trimer with Arabidopsis thaliana NF-YB2/NF-YC3 to efficiently bind the CORE element of the FLOWERING LOCUS T promoter. We term this complex NF-CO. Using saturation mutagenesis, electrophoretic mobility shift assays, and RNA-sequencing profiling of co, nf-yb, and nf-yc mutants, we identify CCACA elements as the core NF-CO binding site. CO physically interacts with the same HFD surface required for NF-YA association, as determined by mutations in NF-YB2 and NF-YC9, and tested in vitro and in vivo. The co-7 mutation in the CCT domain, corresponding to an NF-YA arginine directly involved in CCAAT recognition, abolishes NF-CO binding to DNA. In summary, a unifying molecular mechanism of CO function relates it to the NF-YA paradigm, as part of a trimeric complex imparting sequence specificity to HFD/DNA interactions. It is likely that members of the large CCT family participate in similar complexes with At-NF-YB and At-NF-YC, broadening HFD combinatorial possibilities in terms of trimerization, DNA binding specificities, and transcriptional regulation

    Aging without disorder on long time scales

    We study the Metropolis dynamics of a simple spin system without disorder, which exhibits glassy dynamics at low temperatures. We use an implementation of the algorithm of Bortz, Kalos and Lebowitz. This method turns out to be very efficient for the study of glassy systems, which get trapped in local minima on many different time scales. We find strong evidence of aging effects at low temperatures. We relate these effects to the distribution function of the trapping times of single configurations.

    RNA sequencing of Populus x canadensis roots identifies key molecular mechanisms underlying physiological adaption to excess zinc

    Populus x canadensis clone I-214 exhibits a general indicator phenotype in response to excess Zn, and a higher metal uptake in roots than in shoots with a reduced translocation to aerial parts under hydroponic conditions. This physiological adaptation seems mainly regulated by roots, although the molecular mechanisms that underlie these processes are still poorly understood. Here, differential expression analysis using RNA-sequencing technology was used to identify the molecular mechanisms involved in the response to excess Zn in roots. In order to maximize specificity of detection of differentially expressed (DE) genes, we considered the intersection of genes identified by three distinct statistical approaches (61 up- and 19 down-regulated) and validated them by RT-qPCR, yielding an agreement of 93% between the two experimental techniques. Gene Ontology (GO) terms related to oxidation-reduction processes, transport and cellular iron ion homeostasis were enriched among DE genes, highlighting the importance of metal homeostasis in adaptation to excess Zn by P. x canadensis clone I-214. We identified the up-regulation of two Populus metal transporters (ZIP2 and NRAMP1) probably involved in metal uptake, and the down-regulation of a NAS4 gene involved in metal translocation. We also identified four Fe-homeostasis transcription factors (two bHLH38 genes, FIT and BTS) that were differentially expressed, probably to reduce Zn-induced Fe-deficiency. In particular, we suggest that the down-regulation of the FIT transcription factor could be a mechanism to cope with Zn-induced Fe-deficiency in Populus. These results provide insight into the molecular mechanisms involved in adaptation to excess Zn in Populus spp., and could also constitute a starting point for the identification and characterization of molecular markers or biotechnological targets for possible improvement of the phytoremediation performance of poplar trees.

    SMRT long reads and Direct Label and Stain optical maps allow the generation of a high-quality genome assembly for the European barn swallow (Hirundo rustica rustica)

    Background: The barn swallow (Hirundo rustica) is a migratory bird that has been the focus of a large number of ecological, behavioral, and genetic studies. To facilitate further population genetics and genomic studies, we present a reference genome assembly for the European subspecies (H. r. rustica). Findings: As part of the Genome10K effort on generating high-quality vertebrate genomes (Vertebrate Genomes Project), we have assembled a highly contiguous genome assembly using single molecule real-time (SMRT) DNA sequencing and several Bionano optical map technologies. We compared and integrated optical maps derived from both the Nick, Label, Repair, and Stain technology and from the Direct Label and Stain (DLS) technology. As proposed by Bionano, DLS more than doubled the scaffold N50 with respect to the nickase. The dual enzyme hybrid scaffold led to a further marginal increase in scaffold N50 and an overall increase of confidence in the scaffolds. After removal of haplotigs, the final assembly is approximately 1.21 Gbp in size, with a scaffold N50 value of more than 25.95 Mbp. Conclusions: This high-quality genome assembly represents a valuable resource for future studies of population genetics and genomics in the barn swallow and for studies concerning the evolution of avian genomes. It also represents one of the very first genomes assembled by combining SMRT long-read sequencing with the new Bionano DLS technology for scaffolding. The quality of this assembly demonstrates the potential of this methodology to substantially increase the contiguity of genome assemblies
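
For readers unfamiliar with the scaffold N50 statistic quoted above, the following Python sketch shows the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the barn swallow assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy scaffold lengths in arbitrary units (not real data).
    print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 depends only on the length distribution, joining scaffolds with optical maps raises it without adding any sequence, which is why it is used here as a measure of contiguity rather than of completeness.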